Testnet4 including PoW difficulty adjustment fix #29775
Concept ACK
When resetting a test chain, it is also important to consider the script interpreter coverage of the current chain. Test chains are (usually) the first place to go to test new script primitives and protocols, as well as consensus deployments. The existing chain thus serves as a test for consensus implementations, apart from basic unit test vectors. It would be good to think about how to preserve the test vectors in the chain. See also #11739 (comment). Or if it is not needed, it would be good to say so. Maybe https://github.com/bitcoin-core/qa-assets/blob/main/unit_test_data/script_assets_test.json already covers a good portion? Moreover, testnet is the only public chain where anyone can submit a nonstandard transaction from their laptop. Recall that policy is enforced on all networks equally (see commit e1dc15d), so getting a non-mempool transaction into a block is only possible for a miner, or by cooperating with a miner. So if the difficulty hack is removed completely, anyone wishing to submit such a transaction would have to purchase and set up mining hardware, or find a miner willing to accept the transaction. I am not saying what the best approach is here, just that the effects should be considered and any change made intentionally.
Probably we should support tracking both testnet3 and the new testnet4 for some time. Making the new code conditional on a different chain param that's only set for testnet4 would probably be the easiest way of accomplishing that?
Pushed some improvements and addressed some feedback. I am experimenting with some of the proposals from the mailing list, so I added Andrew Poelstra's suggested difficulty adjustment with 6h/1M from here: https://groups.google.com/g/bitcoindev/c/9bL00vRj7OU/m/kFPaQCzmBwAJ
Updated the code to introduce T4. I am using the genesis block hash to distinguish between the two testnets. There may be cleaner solutions, but I think this is OK since it would only be temporary until T3 is removed.
Is it really realistic that someone with just their CPU would be able to mine a block with their non-standard tx on the current testnet? If the bug isn't active currently they would need to wait for it to become active and that could take weeks, right? And when it becomes active I would imagine the miner who found the first block in the difficulty=1 series just blasts the network and the CPU miner still has no chance to get a block in between. We could revert #28354 for testnet4 if this is a feature that matters to users. Is it too much to ask that people use
Interesting thought. I think once there is consensus to do T4 we will find a creative solution for this. It would be cool to convert this coverage to fuzzing coverage somehow, but I am not sure if that's realistic or worth the effort. Otherwise, we could write a program that looks at all the different scripts that exist on T3 and replays them on T4, or, if we can compress them somehow, e.g. by filtering out everything that doesn't add coverage, turn it into a unit test that replays the interesting scripts.
Since some people consider the blockstorms an interesting feature of Testnet3, it might be interesting to only raise the difficulty of the delayed-block exception to 100,000 instead of 1,000,000. This would allow the network to return to the organic difficulty in fewer difficulty periods and slow down the blockstorms without removing the feature altogether. My understanding is that this would correspond to roughly a tenth of one S9 mining on the network, so if no one had mined for a while, a single S9 could restart the network with ~60 s blocks, but wouldn't churn out thousands of blocks per second. Only allowing lower-difficulty blocks after 6 hours could easily make testnet useless for extended periods: if someone put several ASICs on testnet for a while, it might prevent other users from getting confirmations for up to 6 hours. I could see an increase from the twenty-minute rule to maybe an hour, but more seems counter to why the rule was introduced in the first place.
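The rough numbers above can be sanity-checked with the standard relation between difficulty and expected work (difficulty × 2^32 hashes per block). This is a back-of-envelope sketch; the 14 TH/s S9 figure is taken from this discussion, not measured:

```python
# Expected time to find one block at a given difficulty and hashrate.
# difficulty * 2**32 is the expected number of hashes needed per block.

def expected_block_seconds(difficulty: float, hashrate_hs: float) -> float:
    """Expected seconds per block for a solo miner at hashrate_hs hashes/s."""
    return difficulty * 2**32 / hashrate_hs

S9_HASHRATE = 14e12  # ~14 TH/s, nominal Antminer S9 rate (assumption)

# One full S9 at the proposed reduced exception difficulty of 100,000:
t_full = expected_block_seconds(100_000, S9_HASHRATE)       # roughly half a minute
# A tenth of an S9 at the same difficulty:
t_tenth = expected_block_seconds(100_000, S9_HASHRATE / 10)  # a few minutes
```

The exact block interval depends on how much of the hashrate actually shows up, so these figures are order-of-magnitude only.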
Yes, I am not sure what the problem would be. All you have to do is set the timestamp +20 min and mine a block on your laptop. If you don't want to try it yourself, you can come by and watch it on my laptop.
After a quick chat with @murchandamus, an alternative fix would be to require the pre-retarget block to have the "correct" difficulty, so that all retarget periods are organic. The +20min hack would remain to allow a CPU to mine a few blocks, if needed, however, a block storm would be naturally limited by the +120h cut-off rule. This would limit the block storms to small block "gusts", which seems good enough to make everyone happy?
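A minimal sketch of the rule proposed here, not actual Bitcoin Core code (function name and parameters are illustrative): the minimum-difficulty exception applies only when the block is more than 20 minutes after its parent AND it is not the last block of a retarget period, so every retarget is computed from a full-difficulty block.

```python
RETARGET_INTERVAL = 2016   # blocks per difficulty period
MIN_DIFF_WINDOW = 20 * 60  # 20 minutes, in seconds

def min_difficulty_allowed(height: int, block_time: int, prev_block_time: int) -> bool:
    """Return True if a block at `height` may use the minimum-difficulty
    exception: it must arrive >20 min after its parent AND must not be
    the final block of a retarget period (heights 2015, 4031, ...)."""
    is_pre_retarget_block = (height + 1) % RETARGET_INTERVAL == 0
    slow_enough = block_time > prev_block_time + MIN_DIFF_WINDOW
    return slow_enough and not is_pre_retarget_block
```

Under this sketch a CPU miner can still grab exception blocks mid-period, but the block feeding the next retarget always carries the chain's real difficulty.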
I spun up When running with If anyone wants to deploy a faucet, let me know and I'll send some coins... unless someone reorgs me.
This seems too complicated for a testnet exception IMO. And it breaks the use case of someone testing being able to mine a block on-demand without actual mining hardware. Shouldn't it be enough to just fix the timewarp bug?
I doubt many people do that. You can still set |
Two people raised the concern in this thread, so why would you doubt it?
I missed that response. If this is possible at any time, with or without a block storm happening, I am not sure how the change here makes a difference. I will give it a try.
"Many" is very relative, but I think we probably would not see a market for trading testnet coins against bitcoin if that were something everyone could do as easily as setting up a Bitcoin Core node, for example.
If it depends on the difficulty being 1 rather than 1 million, that would make a difference. The two people who brought it up can definitely recompile, but maybe there's a better solution - maybe just a startup flag to override the minimum difficulty?
I don't think consensus rules of remote nodes can be affected by a local startup flag (or re-compilation). If someone wanted to create a block locally only, they could use regtest. |
There are several BIPs that contain specifications relating to testnet, so perhaps a BIP is the right place to define testnet4? The BIP process predates testnet3, but only by a few months, so I don't think we should see the lack of a testnet3 BIP as an argument against this.
Not even a BIP but some document that specifies testnet4 besides just a PR that still might get changed. I think in 2024 we can agree that there's more than just Bitcoin Core and asking other implementations to "read the Bitcoin Core codebase" is a ridiculous ask. |
Here is a BIP PR for Testnet 4: bitcoin/bips#1601. I think the written specification needs to be a BIP to be considered meaningful in the long run. If I just put it in a gist or something like that, it depends on me alone to make changes should they become necessary. I would rather have it be managed by the community if the written specification is what people turn to. The PR still might get changed, obviously, but I will update the BIP PR accordingly.
I'm pouring one out for all the tACKs we've lost, but the rebase was necessary for a possible merge. I have addressed the comments from @Sjors and I think those were all that are in scope for this PR. Mostly it's adding comments and two small code simplifications.
I think the chain replay idea from @TheBlueMatt is probably best tracked in a separate issue. Potentially there are already projects out there that provide the necessary functionality; I am not aware of anything like that though.
The CI failure doesn't seem related, somehow the test-each-commit job was instantly cancelled. |
I forgot to address this comment from @Sjors earlier: I think it's an interesting idea, but I am not sure about the adverse effects it could have. The base case for the network should be that we have a fluctuating but somewhat stable hashrate, and a few people will use the 20-min rule to get their non-standard txs in or just to get some coins. How many of the 2016 blocks these will be, I don't know. Let's say it's 100 20-min blocks and they are always mined instantly (no time wasted; I am ignoring real 20-min blocks, of which we will certainly also see a few). Then in this state the difficulty should be adjusted up (because the 100 blocks came fast), but instead it would be adjusted down because of the adjustment.

The worst case, I think, is that someone tries to get as many 20-min blocks as possible constantly and with that grinds down the difficulty, so that we end up with a much faster block time than 10 minutes. I didn't do the exact math to see where we would reach an equilibrium in that case, but I think this would be pretty annoying. I think even if there is no attack we would end up with a faster block time on average. Maybe we don't want to assume any stability as the base case, but I am still not sure the upside outweighs the downside. I will keep thinking about it in case there are more edge cases where this might lead to different outcomes than we want.
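To make the skew concrete, here is a back-of-envelope sketch of the standard retarget rule (scale by expected/actual timespan, clamped to 4x), under the simplifying assumption that the other 1916 blocks arrive at exactly 10 minutes. It illustrates the direction and rough size of the effect, not the exact equilibrium:

```python
EXPECTED_TIMESPAN = 2016 * 600  # two weeks, in seconds

def next_difficulty(old_difficulty: float, actual_timespan: int) -> float:
    """Standard retarget: scale difficulty by expected/actual timespan,
    clamped to a factor of 4 in either direction."""
    actual = max(min(actual_timespan, EXPECTED_TIMESPAN * 4), EXPECTED_TIMESPAN // 4)
    return old_difficulty * EXPECTED_TIMESPAN / actual

# If 100 of the 2016 blocks are mined instantly via the exception and the
# remaining 1916 take exactly 10 minutes each, the measured timespan shrinks
# to 1916 * 600 s, nudging the next difficulty up by 2016/1916, i.e. ~5%.
skewed = next_difficulty(1_000_000, 1916 * 600)
```

Whether the adjustment ends up pushing difficulty up or down in practice depends on which blocks (and timestamps) the adjustment rule actually counts, which is exactly the open question in this thread.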
I missed that before, but the improved text in the BIP made me realize: it looks like an attacker could jack up the difficulty by mining a few difficulty periods with an ASIC and then stop after the last block in a difficulty period. The network would then be on a difficulty some 4^n higher than before, and stuck looking to mine the first block at full difficulty. I previously understood that for the difficulty it just goes back to the latest block that has a non-1 difficulty, and didn't realize that the first block would need to be mined at full difficulty. Is there a way to prevent the reset to minimum difficulty while allowing the 20-minute exception for every block? Would that require something like @ajtowns's idea of storing the actual difficulty in the version field, or similar?
Tested ACK 86fea43 |
Yes, this has also been described here by AJ. @Sjors's suggestion to change
I must have read that before it was edited and missed the edit.
Thanks for keeping the overview! :) |
I don't think that's much of a concern -- all you'd need to do is invalidateblock the last block of the period, mine a new one with a much later timestamp, and then mine another block in the new period, that no longer has a 4x increased difficulty. At that point your new chain has more work than the old chain, and you continue from there with normal difficulty. An attacker with 50x more hashpower than everyone else combined could conceivably rush 2015 blocks in ~6 hours, leaving the chain stalled for two weeks, and potentially repeat that attack as often as they liked, but that's probably a fair amount of hashpower to dedicate to griefing testnet, at which point switching to signet or spinning up testnet5 would presumably make sense.
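The "~6 hours" figure follows from simple arithmetic, assuming the current difficulty was set by the rest of the network mining at the 10-minute target pace:

```python
BLOCK_TARGET_SECONDS = 600  # 10-minute target interval

def rush_time_hours(blocks: int, hashpower_multiple: float) -> float:
    """Time for a miner with `hashpower_multiple` times the hashrate that
    set the current difficulty to mine `blocks` blocks, in hours."""
    return blocks * BLOCK_TARGET_SECONDS / hashpower_multiple / 3600

# 2015 blocks at 50x the network hashpower works out to roughly 6.7 hours.
rush = rush_time_hours(2015, 50)
```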
It also seems to me that it's not crazy to have a testnet-specific validity rule that isn't "the checkpoint code" :).
No it wasn't, it seems I was confused myself and thought there was no problem: #29775 (comment) |
I think that could reduce the difficulty by up to a factor of 16 (if you are willing to wait up to eight weeks), but I don’t see how someone needing to manually intervene and most likely still needing an ASIC mitigates the potential liveness issue here.
I’m not concerned with an attacker that has 50× more hashpower. If we were in a situation where Testnet has no ASICs mining it and someone points a millionth of the mainnet hashpower at Testnet (today about 650 TH/s), they could easily mine 10 difficulty periods in minutes and put the first block of the next difficulty period well out of the range of non-ASICs. If someone points more hashrate at it, e.g. in the range of a thousandth of mainnet, it could easily shoot the difficulty up even out of the range of a small number of S9s, which nominally have 14 TH/s. As far as I recall, that’s exactly the problem we had with Testnet 1 and 2 that led to the 20-minute exception being introduced in the first place: mining pools occasionally test their setups on Testnet and cause the difficulty to shoot up like crazy.
The hashpower on Testnet3 seems to have been fluctuating around ~500 TH/s over the past 2 years: https://mempool.space/testnet/graphs/mining/hashrate-difficulty If that data is correct, I am not so concerned about 650 TH/s, but I am not sure if the 20-min exception blocks mess with those statistics. Why do you think this issue has never happened on Testnet3? Someone could run up the difficulty there today like you describe and leave on the last block of a difficulty adjustment period; the chain would stall the same as with the code here. I wrote a little script to check how many adjustment blocks took longer than 20 min and what the most extreme cases were. It doesn't look too bad, honestly: over these 13 years we have had a handful that took more than an hour, and two over 5 h, but I would have expected to find worse. The delta just compares the timestamp to the previous block, so, granted, there could be some shenanigans going on and that's not the real delta, but even if the prev block actually had a timestamp 2 hours in the future, for example, it still doesn't seem too terrible to me to have these few outliers over 13 years. See logs
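A sketch of the kind of check described, reduced to pure logic over (height, timestamp) pairs (fetching the actual headers, e.g. via RPC, is left out; the function name and shape are illustrative, not the script actually used):

```python
RETARGET_INTERVAL = 2016

def slow_retarget_blocks(blocks, threshold=20 * 60):
    """Given a list of (height, timestamp) pairs sorted by height, return
    (height, delta_seconds) for every first block of a retarget period
    whose gap to the previous block exceeds `threshold` seconds."""
    slow = []
    for (h_prev, t_prev), (h, t) in zip(blocks, blocks[1:]):
        if h % RETARGET_INTERVAL == 0 and t - t_prev > threshold:
            slow.append((h, t - t_prev))
    return slow
```

As noted above, timestamp deltas can be gamed by miners, so the output is a rough indicator rather than true wall-clock waiting time.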
To supplement the ongoing conceptual discussion about a testnet reset I have drafted a move to v4 including a fix to the difficulty adjustment mechanism, which was part of the motivation that started the discussion.
Conceptual considerations:
The fix modifies the CalculateNextWorkRequired function and uses the same logic used in GetNextWorkRequired to find the last previous block that was not mined with difficulty 1 under the exception. An alternative fix briefly mentioned on the mailing list by Jameson Lopp would be to "restrict the special testnet minimum difficulty rule so that it can't be triggered on the block right before a difficulty retarget". That would also fix the issue, but I find my suggestion here a bit more elegant.
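A simplified sketch of the walk-back logic described (mirroring the spirit of GetNextWorkRequired; the chain representation here is a toy, not the real CBlockIndex):

```python
RETARGET_INTERVAL = 2016

def last_real_difficulty(chain, min_difficulty_bits):
    """Walk back from the tip past blocks mined under the minimum-difficulty
    exception and return the nBits of the last block carrying the chain's
    actual difficulty. `chain` is a list of (height, nBits) tuples; blocks
    at period boundaries are treated as carrying real difficulty."""
    for height, nbits in reversed(chain):
        if nbits != min_difficulty_bits or height % RETARGET_INTERVAL == 0:
            return nbits
    return min_difficulty_bits  # chain consists only of exception blocks
```

Feeding this value into the retarget calculation (instead of the possibly difficulty-1 nBits of the period's final block) is what prevents exception blocks from resetting the adjusted difficulty.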